Concurrency and Parallelism

CIS 193 – Go Programming

Prakhar Bhandari, Adel Qalieh

CIS 193

Course Logistics

File I/O

Simple I/O with ioutil

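package main

import (
    "io/ioutil"
    "log"
)

// Since Go 1.16, os.ReadFile and os.WriteFile are the preferred equivalents.
func main() {
    // Read the entire file into memory
    b, err := ioutil.ReadFile("input.txt")
    if err != nil {
        log.Fatal(err)
    }

    // Write the entire file in one call
    err = ioutil.WriteFile("output.txt", b, 0644)
    if err != nil {
        log.Fatal(err)
    }
}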

What's bad about loading the entire file into memory?

File I/O with bufio

Provides buffered I/O

Creating a read buffer

import (
    "bufio"
    "io"
    "log"
    "os"
)

func main() {
    // Open input file
    fi, err := os.Open("input.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer fi.Close()

    // Read buffer
    r := bufio.NewReader(fi)
    ...
}

File I/O with bufio

Creating a write buffer

func main() {
    ...
    // Open output file
    fo, err := os.Create("output.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer fo.Close()

    // Write buffer
    w := bufio.NewWriter(fo)
    ...
}

File I/O with bufio

Copying a file

func main() {
    // r and w are the bufio.Reader and bufio.Writer created on the previous slides
    // []byte buffer for each chunk
    buf := make([]byte, 1024)
    for {
        n, err := r.Read(buf) // Read a chunk
        if err != nil && err != io.EOF {
            log.Fatal(err)
        }
        if n == 0 {
            break
        }

        // Write chunk
        if _, err := w.Write(buf[:n]); err != nil {
            log.Fatal(err)
        }
    }

    // Flush any bytes still held in the write buffer
    if err := w.Flush(); err != nil {
        log.Fatal(err)
    }
}

Concurrency

A system where several processes are executing at the same time - potentially interacting with each other

Concurrency is about dealing with many things at the same time

Concurrency has more to do with system design than execution: it is a design property of a program in which two or more tasks can be in progress at the same time (but not necessarily executing at the same time)

Examples?

Parallelism

Computation where many calculations are being done simultaneously

Often used for situations where large problems can be divided into smaller ones, which are solved in parallel

Parallelism is doing lots of things at once: a run-time property where two or more tasks are executed simultaneously

Examples?

Concurrency vs Parallelism

Concurrency != Parallelism

Concurrency is about dealing with lots of things at once; parallelism is about doing lots of things at once

Concurrency is a way to structure a program by breaking it up into pieces that can execute independently

Concurrency can let us structure a problem in a way that may (or may not) be parallelizable

Concurrency vs Parallelism Example

Example adapted from Rob Pike's talk "Concurrency is not Parallelism"

Problem: Move a pile of obsolete books to the incinerator

What components make up this task?

How can we speed this up?

Concurrency vs Parallelism Example

One solution: Add another gopher and cart!

The pile and incinerator will have bottlenecks - we need to add some sort of communication between the gophers

Concurrency vs Parallelism Example

This is concurrent composition!

Is this parallel?

Concurrency vs Parallelism Example

The previous design isn't automatically parallel; however, it is automatically parallelizable

Concurrent designs aren't necessarily parallel, but can enable parallelism
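
To make this concrete, here is a hedged sketch of the two-gopher design. It uses goroutines and channels, which the course covers later, and the name `burnBooks` and the book counts are made up for illustration. The same concurrent program gives the same answer whether Go is limited to one CPU (concurrent, not parallel) or allowed to use all of them (possibly parallel).

```go
package main

import (
    "fmt"
    "runtime"
    "sync"
)

// burnBooks (hypothetical) moves every book from the pile to the
// incinerator using the given number of gophers, and returns how many
// books were burned.
func burnBooks(books, gophers int) int {
    pile := make(chan int)        // the shared pile of books
    incinerator := make(chan int) // the shared incinerator

    // Fill the pile, then close it so the gophers know when to stop.
    go func() {
        for b := 0; b < books; b++ {
            pile <- b
        }
        close(pile)
    }()

    // Each gopher repeatedly takes a book from the pile and carts it over.
    var wg sync.WaitGroup
    for g := 0; g < gophers; g++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for b := range pile {
                incinerator <- b
            }
        }()
    }

    // Close the incinerator once every gopher has finished.
    go func() {
        wg.Wait()
        close(incinerator)
    }()

    burned := 0
    for range incinerator {
        burned++
    }
    return burned
}

func main() {
    // At most one goroutine executes at a time: the gophers take turns.
    runtime.GOMAXPROCS(1)
    fmt.Println(burnBooks(100, 2)) // 100

    // Same design, more CPUs: the gophers may now run in parallel.
    runtime.GOMAXPROCS(runtime.NumCPU())
    fmt.Println(burnBooks(100, 2)) // 100
}
```

The channels play the role of the communication between gophers: only one gopher can take a given book from the pile, and only one delivery reaches the incinerator at a time.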

Concurrency vs Parallelism Example

Another concurrent design

Each gopher has a specific simple task

Potentially four times faster than the original one gopher design
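
A staged design like this maps naturally onto a pipeline. The sketch below is only an illustration under assumptions: the stage names `load`, `move`, and `burn` are hypothetical, the fourth gopher (returning empty carts) is left out for brevity, and the goroutines and channels it relies on are covered later in the course.

```go
package main

import "fmt"

// load takes each book off the pile and puts it on a cart.
func load(pile []string) <-chan string {
    out := make(chan string)
    go func() {
        defer close(out)
        for _, b := range pile {
            out <- b
        }
    }()
    return out
}

// move carts each loaded book over to the incinerator.
func move(in <-chan string) <-chan string {
    out := make(chan string)
    go func() {
        defer close(out)
        for b := range in {
            out <- b
        }
    }()
    return out
}

// burn unloads each book into the incinerator and returns the count.
func burn(in <-chan string) int {
    n := 0
    for range in {
        n++
    }
    return n
}

func main() {
    pile := []string{"A", "B", "C"}
    fmt.Println(burn(move(load(pile)))) // 3
}
```

Each stage does one simple job and hands its result to the next, so the stages can overlap in time when more than one CPU is available.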

Concurrency vs Parallelism Example

We can now parallelize this system

Is this necessarily parallel?

Concurrency vs Parallelism Example

There are lots of other concurrent designs

Once an efficient concurrent design has been made, parallelization can be added in

What does this example relate to?

Concurrency vs Parallelism Example

Substitute:

We can think about this example as a concurrent design for a scalable web service

Race Conditions

A situation where the program gives incorrect results for certain interleavings of the operations

In Go, this will happen with operations of multiple goroutines (more on these next week)

Race Conditions Example

// Package bank implements a bank with one joint account.
package bank

var balance int

func Deposit(amount int) { 
    balance = balance + amount 
}

func Balance() int { 
    return balance 
}

Any sequential calls to Deposit and Balance should give correct results

Are they always sequential?

Race Conditions Example

Let's add two functions for this joint account

func Rob() {
    bank.Deposit(100)                                // R1
    fmt.Println("Bank balance is", bank.Balance())   // R2
}

func Ken() {
    bank.Deposit(200)                                // K
}

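Possible interleavings:

R1, R2, K    // Rob sees "Bank balance is 100"; final balance is 300
R1, K, R2    // Rob sees "Bank balance is 300"
K, R1, R2    // Rob sees "Bank balance is 300"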

Are there any issues with these orderings?

Race Conditions Example

Are there any more cases?

Break up Rob's deposit into two components: read and write

func Deposit(amount int) { 
    balance = balance + amount 
}

Consider this ordering:

R1r    // Rob reads balance: 0
K      // Ken deposits: balance = 200
R1w    // Rob writes: balance = 0 + 100 = 100 (Ken's deposit is lost)
R2     // prints "Bank balance is 100"
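
The ordering above can be replayed step by step in ordinary sequential code. This is a sketch for illustration only: `lostUpdate` is a hypothetical function that imitates the interleaving by hand, rather than a real concurrent program.

```go
package main

import "fmt"

// lostUpdate (hypothetical) replays the ordering R1r, K, R1w by hand
// and returns the balance that R2 would print.
func lostUpdate() int {
    balance := 0

    // R1r: Rob's Deposit reads the balance.
    r1 := balance // 0

    // K: Ken's Deposit runs to completion in between.
    balance = balance + 200 // 200

    // R1w: Rob's Deposit writes back using its stale read,
    // overwriting Ken's deposit.
    balance = r1 + 100 // 100

    return balance
}

func main() {
    // R2
    fmt.Println("Bank balance is", lostUpdate()) // Bank balance is 100
}
```

In a real concurrent program this interleaving shows up only occasionally, which is what makes it hard to debug. Go's race detector (for example `go run -race`) can report unsynchronized accesses like this one.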

How can we prevent this?

Mutual Exclusion (mutex)

One potential solution: use mutual exclusion - only allow one concurrent process to access the shared variables at a time

Only the process with access to the "token" is allowed to do work

Mutexes in Go

// Lock locks m. If the lock is already in use, the calling
// goroutine blocks until the mutex is available.
func (m *Mutex) Lock()

// Unlock unlocks m. It is a run-time error if m is not locked
// on entry to Unlock.
func (m *Mutex) Unlock()

Mutex Example

import "sync"

var (
    mu      sync.Mutex // guards balance
    balance int
)

func Deposit(amount int) {
    mu.Lock()
    balance = balance + amount
    mu.Unlock()
}

func Balance() int {
    mu.Lock()
    b := balance
    mu.Unlock()
    return b
}
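
As a rough check that the mutex does its job, the sketch below makes 1000 deposits from 1000 goroutines and waits for them all. The helper `depositConcurrently` and the amounts are assumptions for illustration, and `defer mu.Unlock()` is used as a common variant of the explicit unlock shown above.

```go
package main

import (
    "fmt"
    "sync"
)

var (
    mu      sync.Mutex // guards balance
    balance int
)

func Deposit(amount int) {
    mu.Lock()
    defer mu.Unlock() // runs when Deposit returns, even on early exit
    balance = balance + amount
}

func Balance() int {
    mu.Lock()
    defer mu.Unlock()
    return balance
}

// depositConcurrently (hypothetical helper) makes n deposits of 1,
// each from its own goroutine, and waits for all of them to finish.
func depositConcurrently(n int) {
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            Deposit(1)
        }()
    }
    wg.Wait()
}

func main() {
    depositConcurrently(1000)
    fmt.Println("Bank balance is", Balance()) // Bank balance is 1000
}
```

Without the lock, some of the read-then-write steps could interleave and the final balance could come out below 1000.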

sync.RWMutex

In the previous example, it is safe for several goroutines to call Balance() concurrently, as long as no Deposit() call is running

sync.RWMutex is a mutex that allows read-only operations to proceed in parallel with each other, but makes sure write operations have fully exclusive access

See sync.RWMutex for more details

Interesting Aside: Timing functions

Use the time package

Useful functions

How to time a function?

func timerFunc(start time.Time, name string) {
    elapsed := time.Since(start)
    fmt.Printf("%s took %s\n", name, elapsed)
}

func SampleFunc() {
    defer timerFunc(time.Now(), "SampleFunc")
    // ...
}

Homework 4

Thank you

Prakhar Bhandari, Adel Qalieh

CIS 193
